Intuitive visualizations of pitch and loudness in speech

نویسندگان

  • Rebecca S. Schaefer
  • Lilian J. Beijer
  • Wiel Seuskens
  • Toni C. M. Rietveld
  • Makiko Sadakata
چکیده

Visualizing acoustic features of speech has proven helpful in speech therapy; however, it is as yet unclear how to create intuitive and fitting visualizations. To better understand the mappings from speech sound aspects to visual space, a large web-based experiment (n = 249) was performed to evaluate spatial parameters that may optimally represent pitch and loudness of speech. To this end, five novel animated visualizations were developed and presented in pairwise comparisons, together with a static visualization. Pitch and loudness of speech were each mapped onto either the vertical (y-axis) or the size (z-axis) dimension, or combined (with size indicating loudness and vertical position indicating pitch height) and visualized as an animation along the horizontal dimension (x-axis) over time. The results indicated that firstly, there is a general preference towards the use of the y-axis for both pitch and loudness, with pitch ranking higher than loudness in terms of fit. Secondly, the data suggest that representing both pitch and loudness combined in a single visualization is preferred over visualization in only one dimension. Finally, the z-axis, although not preferred, was evaluated as corresponding better to loudness than to pitch. This relation between sound and visual space has not been reported previously for speech sounds, and elaborates earlier findings on musical material. In addition to elucidating more general mappings between auditory and visual modalities, the findings provide us with a method of visualizing speech that may be helpful in clinical applications such as computerized speech therapy, or other feedback-based learning paradigms.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Modeling the microprosody of pitch and loudness for speech synthesis with neural networks

In this study of Finnish microprosody, two prosodic parameters — pitch and loudness — were modeled with artificial neural networks. The networks are of the general feed forward type trained with backpropagation. For each phoneme, the network predicts a series of either pitch or loudness values on the basis of information of the phoneme’s phonologically motivated features and its phonetic enviro...

متن کامل

Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language

Setup of an emotion recognition or emotional speech recognition system is directly related to how emotion changes the speech features. In this research, the influence of emotion on the anger and happiness was evaluated and the results were compared with the neutral speech. So the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...

متن کامل

Analysis of Lombard speech using excitation source information

This paper examines the Lombard effect on the excitation features in speech production. These features correspond mostly to the acoustic features at subsegmental (< pitch period) level. The instantaneous fundamental frequency F0 (i.e., pitch), the strength of excitation at the instants of significant excitation and a loudness measure reflecting the sharpness of the impulse-like excitation aroun...

متن کامل

The Function of Pitch Range Variations in Samples of Emotional Expressions in Persian

This study aims at investigating the interface between emotion and intonation patterns (more specifically, duration and pitch amplitude of speech). To this end, the acoustic properties of spectral parameters related to speech prosody are investigated. The results of acoustic and Statistical analysis show that mean level and range of FO in the contours vary strongly as a function of the degree o...

متن کامل

ffects of utterance length and vocal loudness on speech breathing n older adults

Age-related reductions in pulmonary elastic recoil and respiratory muscle strength can affect how older adults generate subglottal pressure required for speech production. The present study examined agerelated changes in speech breathing by manipulating utterance length and loudness during a connected speech task (monologue). Twenty-three older adults and twenty-eight young adults produced a mo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 23  شماره 

صفحات  -

تاریخ انتشار 2016